Master's Thesis in Computational Linguistics Jan

Author

  • Jan Lundberg
Abstract

Cluster analysis is the name for a group of multivariate techniques whose primary purpose is to group objects based on the characteristics they possess. Clustering has been applied in many contexts and by researchers in many disciplines. This reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. In this thesis I explore cluster analysis as a means to investigate relationships among speech samples extracted from the Swedia 2000 dialect database. By using mel frequency cepstral coefficients (MFCC) to represent the data, I also show that the cepstrum is a reliable metric for measuring acoustic distance. The methods applied in this thesis may contribute to the validation of dialect distances and to finding the optimal number of clusters in a dialect data set. The results show that this approach may represent an effective tool to support speech analysis applications.
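
The idea of grouping speech samples by acoustic distance can be illustrated with a minimal sketch. It is not the thesis's actual procedure: the Euclidean distance, the average-linkage merge rule, the function names, and the feature values are all assumptions chosen for illustration. Each vector stands in for a hypothetical per-sample mean cepstral vector, and the numbers are made up rather than taken from Swedia 2000.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(vectors, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns a sorted list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average-linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], vectors),
        )
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical stand-ins for per-sample mean cepstral vectors (made-up values).
samples = [
    [1.0, 0.2, -0.5],
    [1.1, 0.1, -0.4],
    [0.9, 0.3, -0.6],
    [-2.0, 1.5, 0.8],
    [-2.1, 1.4, 0.9],
]
groups = agglomerative(samples, n_clusters=2)
print(groups)
```

In this toy setup the number of clusters is fixed by hand. Choosing it from the data, as the abstract describes, would need an additional validation criterion that this sketch does not cover.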

Similar resources

An Investigation of Interactional Metadiscourse in Discussion and Conclusion Sections of Social and Natural Science Master Theses

This study is a corpus-based study of interactional metadiscourse in natural and social science master theses. For this purpose, 30 natural and social science master theses in six disciplines were randomly selected out of the library of five universities. Five master theses were selected in each discipline, in a period of six years (2010-2016). This study analyzed only the discussion and conclus...

A Functional Investigation of Self-mention in Soft Science Master Theses

This study is a quantitative and functional corpus-based study of self-mention in soft science Master theses. One important purpose of this study was to find out the functions of self-mention in soft science Master theses. For this purpose, 20 soft science Master theses in four disciplines (Applied linguistics, Psychology, Geography, and Political sciences), were randomly selected out of the li...

Creating a Bilingual Dictionary using Wikipedia

Acknowledgements I am especially grateful to my supervisor dr. Daniel Zeman for providing me a wise supervision, helpful advices, instant replies and encouragement. I am happy to have a chance to acknowledge the work of my coordinators from Charles University in Prague, for being in touch with me since the first year of Masters program, for being patient, kind, very attentive and helpful with a...

Ordering information on distributions

This thesis details a class of partial orders on the space of probability distributions and the space of density operators which capture the idea of information content. Some links to domain theory and computational linguistics are also discussed. Chapter 1 details some useful theorems from order theory. In Chapter 2 we define a notion of an information ordering on the space of probability dist...

Functional analysis of Subject and Verb in Theses Abstracts on Applied Linguistics

The purpose of the present study is to analyse abstracts related to Applied Linguistics, and more precisely the discourse functions of grammatical subjects and verbs. The corpus consisted of 50 PhD thesis abstracts written on the subject of Applied Linguistics. All of the abstracts were written from 2010 to 2014. The theses from which the abstracts were extracted are available in the ProQuest d...

Publication date: 2005